76 research outputs found

Predicting software project effort: A grey relational analysis based method

Author: Albrecht
Boehm
Boehm
Boetticher
Breunig
Briand
Briand
Briand
Burgess
Cheung
Deng
Deng
Finnie
Huang
Huang
Jain
Jeffery
Jiang
Jou
Jørgensen
Kadoda
Kemerer
Khotanzad
Kohavi
Li
Liu
Luo
Martin Shepperd
Mitchell
Moløkken
Mukhopadhyay
Myrtveit
Putnam
Qinbao Song
Samson
Shepperd
Siedelecki
Song
Srinivasan
Su
Walkerden
Wang
Wang
Wang
Wittig
Wittig
Publication venue: 'Elsevier BV'
Publication date: 01/06/2011

This is the post-print version of the final paper published in Expert Systems with Applications. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2011 Elsevier B.V.

The inherent uncertainty of the software development process presents particular challenges for software effort prediction. We need to systematically address missing data values, outlier detection, feature subset selection and the continuous evolution of predictions as the project unfolds, all in the context of data starvation and noisy data. In this paper, however, we focus on outlier detection, feature subset selection, and effort prediction at an early stage of a project. We propose a novel approach that uses grey relational analysis (GRA) from grey system theory (GST), a recently developed system engineering theory based on the uncertainty of small samples. In this work we address some of the theoretical challenges in applying GRA to outlier detection, feature subset selection, and effort prediction, and then evaluate our approach on five publicly available industrial data sets using both stepwise regression and Analogy as benchmarks. The results are very encouraging: they are comparable to or better than those of other machine learning techniques, which indicates that the method has considerable potential.

National Natural Science Foundation of Chin
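The abstract above does not spell out how GRA scores similarity, so the following is only a minimal sketch of the standard grey relational grade from grey system theory, not the paper's exact procedure. It assumes every feature has already been normalized to a common scale, and the function name `grey_relational_grades` and the distinguishing coefficient `zeta=0.5` are illustrative choices rather than anything taken from the source.

```python
def grey_relational_grades(reference, candidates, zeta=0.5):
    """Return one grey relational grade per candidate series.

    reference  -- the series to compare against (e.g. the target project)
    candidates -- series of the same length (e.g. historical projects)
    zeta       -- distinguishing coefficient in (0, 1]; 0.5 is a common default
    All values are assumed to be normalized beforehand.
    """
    # Absolute differences between the reference and each candidate, per feature.
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]

    # Global minimum and maximum difference over all candidates and features.
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to the reference.
        return [1.0] * len(candidates)

    grades = []
    for row in deltas:
        # Grey relational coefficient for each feature, then its mean (the grade).
        coefficients = [(d_min + zeta * d_max) / (d + zeta * d_max) for d in row]
        grades.append(sum(coefficients) / len(coefficients))
    return grades
```

A higher grade means the candidate is closer to the reference, so in an analogy-style setting the candidates with the highest grades would be the most similar past projects.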

Crossref

Brunel University Research Archive

Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation

Author: Albrecht
Austin
Baird
Batista
Boehm
Boehm
Breiman
Briand
Briand
Briand
Brockmeier
Cartwright
Cheung
Clark
Feelders
Finnie
Gama
Gray
Holte
Jain
Jeffery
Jun Liu
Jönsson
Kemerer
Khotanzad
Kibler
Kim
Kitchenham
Kohavi
Little
Little
Little
Little
Little
Martin Shepperd
Miranda
Myrtveit
Pickard
Putnam
Qinbao Song
Quinlan
Robins
Rubin
Rubin
Rubin
Rubin
Samson
Selby
Shao
Shepperd
Shepperd
Siedelecki
Song
Song
Srinivasan
Strike
Tabachnick
Tay
Walkerden
Walston
Xiangru Chen
Publication venue: 'Elsevier BV'
Publication date: 01/12/2008

Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique to see whether it is better to tolerate missing data or to impute missing values and then apply the C4.5 algorithm. For the investigation, we simulated three missingness mechanisms, three missing data patterns, and five missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5. At the same time, both C4.5 and k-NN are little affected by the missingness mechanism, but the missing data pattern and the missing data percentage have a strong negative impact on prediction (or imputation) accuracy, particularly if the missing data percentage exceeds 40%.
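As a rough illustration of the k-NN imputation idea mentioned in the abstract above, the sketch below fills each missing value with the mean of that feature over the k nearest rows that have it. This is a generic variant under stated assumptions, not the study's implementation: missing values are represented as `None`, all features are numeric, distance is a root-mean-square difference over the features both rows share, and the name `knn_impute` is hypothetical.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of rows with None entries filled by k-NN mean imputation."""

    def distance(a, b):
        # Compare only the features that are observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that have a value for feature j.
            donors = [
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            ]
            donors = [pair for pair in donors if pair[0] != math.inf]
            donors.sort(key=lambda pair: pair[0])
            nearest = donors[:k]
            if nearest:
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed
```

Distances are computed on the original rows, so a value imputed for one row never influences the neighbours chosen for another. A cell stays `None` when no comparable donor exists.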

Crossref

Brunel University Research Archive

Macrophage polarization states in atherosclerosis

Author: Huimei Sun
Jiayong Wu
Pengyu Zhou
Qinbao Peng
Shaoyi Zheng
Shengping He
Sikai Chen
Songlin Du
Xiu Liu
Xuefeng Lin
Zhengkun Song
Publication venue: 'Frontiers Media SA'
Publication date: 01/05/2023

Atherosclerosis, a chronic inflammatory condition primarily affecting large and medium arteries, is the main cause of cardiovascular diseases. Macrophages are key mediators of inflammatory responses. They are involved in all stages of atherosclerosis development and progression, from plaque formation to transition into vulnerable plaques, and are considered important therapeutic targets. Increasing evidence suggests that the modulation of macrophage polarization can effectively control the progression of atherosclerosis. Herein, we explore the role of macrophage polarization in the progression of atherosclerosis and summarize emerging therapies for the regulation of macrophage polarization. The aim is to inspire new avenues of research in disease mechanisms and in the clinical prevention and treatment of atherosclerosis.

Directory of Open Access Journals

Towards Online Multiresolution Community Detection in Large-Scale Networks

Author: A Arenas
A Arenas
A Clauset
A Clauset
A Lancichinetti
A Lancichinetti
A Lancichinetti
A Lancichinetti
DJ Watts
F Luo
F Radicchi
G Palla
Heli Sun
J Chen
J Huang
J Leskovec
Jianbin Huang
JM Kleinberg
JP Bagrow
JP Bagrow
K Hajra
M Girvan
MEJ Newman
MEJ Newman
P Ronhovde
Qinbao Song
R Guimerà
S Fortunato
S Fortunato
Tim Weninger
VD Blondel
WW Zachary
X Xu
Yaguang Liu
Yamir Moreno
Publication venue: Public Library of Science
Publication date: 01/01/2011

The investigation of community structure in networks has aroused great interest in multiple disciplines. One of the challenges is to find local communities from a starting vertex in a network without global information about the entire network. The accuracy of many existing methods depends on a priori assumptions about network properties and on predefined parameters. In this paper, we introduce a new quality function for local communities and present a fast local expansion algorithm for uncovering communities in large-scale networks. The proposed algorithm can detect multiresolution communities from a source vertex, or communities covering the whole network. Experimental results show that the proposed algorithm is efficient and well-behaved in both real-world and synthetic networks.
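The abstract above does not define the paper's quality function, so the sketch below only illustrates the general shape of greedy local expansion from a seed vertex, with a swapped-in fitness: the fraction of edges touching the community that are internal to it. The function name `local_community`, the adjacency-dictionary input, and this fitness are all assumptions made for illustration, not the authors' algorithm.

```python
def local_community(adjacency, seed):
    """Greedily grow a community around seed using only local information.

    adjacency -- dict mapping each vertex to a collection of its neighbours
                 (undirected graph, so every edge appears in both directions)
    """

    def fitness(members):
        # Stand-in quality function: internal edges / (internal + boundary edges).
        internal = boundary = 0
        for node in members:
            for neighbor in adjacency[node]:
                if neighbor in members:
                    internal += 1  # each internal edge is seen from both ends
                else:
                    boundary += 1
        internal //= 2
        total = internal + boundary
        return internal / total if total else 0.0

    community = {seed}
    score = fitness(community)
    while True:
        # Only vertices adjacent to the current community are considered.
        frontier = sorted({n for node in community for n in adjacency[node]} - community)
        best_node, best_score = None, score
        for candidate in frontier:
            candidate_score = fitness(community | {candidate})
            if candidate_score > best_score:
                best_node, best_score = candidate, candidate_score
        if best_node is None:
            # No neighbouring vertex improves the fitness, so expansion stops.
            return community
        community.add(best_node)
        score = best_score
```

Recomputing the fitness from scratch for every candidate keeps the sketch short. A fast implementation would update the internal and boundary counts incrementally instead.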

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A novel feature subset selection algorithm based on association rule mining

Author: Guangtao Wang
Qinbao Song
Publication venue: 'IOS Press'

Crossref

A multi-label learning based kernel automatic recommendation method for support vector machine

Author: Qinbao Song
Xueying Zhang
Publication venue: Public Library of Science (PLoS)

Choosing an appropriate kernel is critical when classifying a new problem with a Support Vector Machine (SVM). So far, more attention has been paid to constructing new kernels and choosing suitable parameter values for a specific kernel function than to kernel selection. Furthermore, most current kernel selection methods seek the single best kernel with the highest classification accuracy via cross-validation; they are time-consuming and ignore the differences in the number of support vectors and the CPU time of SVMs with different kernels. Considering the tradeoff between classification success ratio and CPU time, there may be multiple kernel functions that perform equally well on the same classification problem. Aiming to automatically select those appropriate kernel functions for a given data set, we propose a multi-label learning based kernel recommendation method built on the data characteristics. For each data set, the meta-knowledge data base is first created by extracting the feature vector of data characteristics and identifying the corresponding applicable kernel set. Then the kernel recommendation model is constructed on the generated meta-knowledge data base with the multi-label classification method. Finally, the appropriate kernel functions are recommended to a new data set by the recommendation model according to the characteristics of the new data set. Extensive experiments over 132 UCI benchmark data sets, with five different types of data set characteristics, eleven typical kernels (Linear, Polynomial, Radial Basis Function, Sigmoidal function, Laplace, Multiquadric, Rational Quadratic, Spherical, Spline, Wave and Circular), and five multi-label classification methods demonstrate that, compared with the existing kernel selection methods and the most widely used RBF kernel function, SVM with the kernel function recommended by our proposed method achieved the highest classification performance.

Public Library of Science (PLOS)

Directory of Open Access Journals

A weighted voting-based associative classification algorithm

Author: Jia Zihan
Song Qinbao
Zhu Xiaoyan
Publication date: 10/05/2022

Dalat University Library (Thư viện trường Đại học Đà Lạt)

A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data

Author: Guangtao Wang
Jingjie Ni
Qinbao Song
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref